Back to Article
Article Notebook
Download Source

Changing Traffic Patterns in Washington, DC: Fallout from the COVID-19 Pandemic

PPOL 6805: GIS for Spatial Data Science - Final Project

Author

Holt Cochran

Introduction

Washington, DC is frequently listed as having some of worst traffic in the United States (Inrix, 2024). There are several factors that contribute to the large amount of traffic in the District: the large and dense population within the city limits packs people into a small land area; bottlenecks exist in certain areas of the city, such as the bridges crossing the Potomac River; drivers commuting from the sprawling suburban areas in Virginia and Maryland are funneled into tight corridors of the city; peak hours of the day, when people commute to and from work, escalate the traffic. Taken together, the District of Columbia experiences high rates of traffic, however there is no one single factor that contributes to the large build-up of drivers within the city - rather, it is a confluence of characteristics of the city and drivers that contribute to vehicle backups (Marks & Harrison, 2023).

When COVID-19 began spreading rapidly across the United States in March 2020, businesses, schools, and establishments closed to limit interactions and spread of the virus (Barrot et. al, 2024). People stayed home to shelter, and many functions of society were taken online in a remote setting. This caused an almost immediate stop in the number of people commuting into cities, including Washington, DC, for work, school, or other functions. The ability to work, learn, and exist in a remote capacity removed reasons for continuing to live in the same areas as before the pandemic - research shows that during the pandemic, large swaths of the US population relocated or moved to other areas (Frost, 2023).

Since the height of the COVID-19 lockdowns, society has reopened to in-person interactions, but remote functions of work, school, doctors visits, etc. still exist. My research for this project centers on how changes in human behavior during the COVID-19 pandemic have affected traffic patterns in Washington, DC and what city policymakers should do to address changes in traffic trends.

Hypothesis

Traffic has increased in Washington, DC since the reopening of schools, businesses, and return to office policies, but not to the same level as before the pandemic. However, do to mass relocation of people during the pandemic, traffic patterns and trends have changed in Washington, DC since COVID-19. Traffic has to become more spread out across the city, as people have moved and traffic overall has decreased, leading to new patterns in areas that did not previously have bad traffic. City policies need adjusting to address these changes and alleviate new traffic patterns and trends.

I anticipate that spatial autocorrelation is high before the pandemic, but drops off after the pandemic. This is due to more disperion in traffic, as people stay home more frequently and live farther outside of the city/in more residential areas possibly without large traffic patterns pre-pandemic.

Literature Review

Much of the research into how COVID-19 changed traffic patterns has to do with how lockdown and remote policies affected the volume of drivers on the road. In a paper titled “Impacts of Remote Work on Vehicle Miles Traveled and Transit Ridership in the USA” by Yunhan Zheng, Shenhao Wang, Lun Liu, Jim Aloisi & Jinhua Zhao, the research focuses on the the drop in drivers on roads generally. The paper finds that there was a “2.26% drop in metropolitan statistical area-level transit ridership” in 2022 compared with 2019. They conclude that there a fewer drivers on the roads in general, which changes routes that people who do drive take to get to their destinations (Zheng et. al, 2022)

In a slightly different research topic, researchers at the National Library of Medicine study how the changes in traffic patterns from COVID-19 have affected the number of vehicles on the road, which in turn affects the number of auto accidents. In “An Overview of the Impact of COVID-19 on Road Traffic Safety and Travel Behavior” by Ebrahim Shaik and Samsuddin Ahmed, the authors find that the number of drivers decreased since the start of the pandemic but there was a statistically significant increase in the number of serious accidents. They conclude that this is due to a number of factors listed in the paper, importantly among them is “travel objective” or destination and time spent driving of drivers on the roads (Shaik & Ahmed, 2022). This is possibly do to changes in remote policies of school and work, which are the same hypothesized proponents of traffic changes in my research. This is an interesting compliment to my research and hypothesis, which states that COVID-19 changed traffic behavior for people driving.

Methodology

To research changes in traffic patterns in Washington, DC, I am using data from District of Columbia Department of Transportation (DDOT) from 2018 - 2022 (District of Columbia Government, 2024). The data tracks traffic at a yearly level - these are reported in a statistic called Annual Average Daily Traffic (AADT), which is the average daily traffic a road experiences in a year. The unit of analysis are roads in the district, identified per road by a Route ID. Using data over time, I can examine traffic data in the two years before and after the height of the pandemic in 2020. Comparing traffic patterns over time will allow me to examine changes in trends, patterns, and quantities.

I first plot the traffic patterns on map to visually examine average traffic in DC by year. I then calculate Moran’s I for the traffic of roads for each year of data to assess spatial autocorrelation of traffic. Finally, I run Monte Carlo simulations to assess the statistical significance of the Moran I values. Taken together, these methods show if spatial autocorrelation of traffic exists or has changed over time in the city, as new roads have developed traffic patterns as a result of trends caused by the COVID-19 pandemic.

In [1]:
######## Load Packages ########
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(modelsummary)
Warning: package 'modelsummary' was built under R version 4.3.3
`modelsummary` 2.0.0 now uses `tinytable` as its default table-drawing
  backend. Learn more at: https://vincentarelbundock.github.io/tinytable/

Revert to `kableExtra` for one session:

  options(modelsummary_factory_default = 'kableExtra')
  options(modelsummary_factory_latex = 'kableExtra')
  options(modelsummary_factory_html = 'kableExtra')

Silence this message forever:

  config_modelsummary(startup_message = FALSE)
library(stargazer)

Please cite as: 

 Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
 R package version 5.2.3. https://CRAN.R-project.org/package=stargazer 
library(sf)
Warning: package 'sf' was built under R version 4.3.3
Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
library(spdep)
Warning: package 'spdep' was built under R version 4.3.3
Loading required package: spData
Warning: package 'spData' was built under R version 4.3.3
To access larger datasets in this package, install the spDataLarge
package with: `install.packages('spDataLarge',
repos='https://nowosad.github.io/drat/', type='source')`
library(gganimate)
library(ggplot2)
library(leaflet)
library(gifski)
Warning: package 'gifski' was built under R version 4.3.3
library(purrr)
library(sf)
library(dplyr)
library(leaflet)
library(leaflet.extras)
Warning: package 'leaflet.extras' was built under R version 4.3.3
library(RColorBrewer)
library(viridis)
Loading required package: viridisLite
library(spatstat)
Warning: package 'spatstat' was built under R version 4.3.3
Loading required package: spatstat.data
Warning: package 'spatstat.data' was built under R version 4.3.3
Loading required package: spatstat.univar
Warning: package 'spatstat.univar' was built under R version 4.3.3
spatstat.univar 3.1-1
Loading required package: spatstat.geom
Warning: package 'spatstat.geom' was built under R version 4.3.3
spatstat.geom 3.3-4
Loading required package: spatstat.random
Warning: package 'spatstat.random' was built under R version 4.3.3
spatstat.random 3.3-2
Loading required package: spatstat.explore
Warning: package 'spatstat.explore' was built under R version 4.3.3
Loading required package: nlme
Warning: package 'nlme' was built under R version 4.3.3

Attaching package: 'nlme'

The following object is masked from 'package:dplyr':

    collapse

spatstat.explore 3.3-3
Loading required package: spatstat.model
Warning: package 'spatstat.model' was built under R version 4.3.3
Loading required package: rpart
spatstat.model 3.3-3
Loading required package: spatstat.linnet
Warning: package 'spatstat.linnet' was built under R version 4.3.3
spatstat.linnet 3.2-3

spatstat 3.3-0 
For an introduction to spatstat, type 'beginner' 
library(htmltools)
library(spatialreg)
Warning: package 'spatialreg' was built under R version 4.3.3
Loading required package: Matrix

Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack


Attaching package: 'spatialreg'

The following objects are masked from 'package:spdep':

    get.ClusterOption, get.coresOption, get.mcOption,
    get.VerboseOption, get.ZeroPolicyOption, set.ClusterOption,
    set.coresOption, set.mcOption, set.VerboseOption,
    set.ZeroPolicyOption
rm(list=ls()) # remove objects

# set the working directory
setwd("/Users/holtcochran/PPOL6805_final_project")

######## Import Data ######## 
dc_traffic_2018 <- st_read("Data/2018_Traffic_Volume.geojson", quiet=TRUE)
dc_traffic_2019 <- st_read("Data/2019_Traffic_Volume.geojson", quiet=TRUE)
dc_traffic_2020 <- st_read("Data/2020_Traffic_Volume.geojson", quiet=TRUE)
dc_traffic_2021 <- st_read("Data/2021_Traffic_Volume.geojson", quiet=TRUE)
dc_traffic_2022 <- st_read("Data/2022_Traffic_Volume.geojson", quiet=TRUE)


dc_traffic_2018 <- dc_traffic_2018 %>%
  filter(!is.na(AADT)) %>%
  filter(!st_is_empty(geometry)) %>%
  st_zm(drop = TRUE, what = "ZM")

dc_traffic_2019 <- dc_traffic_2019 %>%
  filter(!is.na(AADT)) %>%
  filter(!st_is_empty(geometry)) %>%
  st_zm(drop = TRUE, what = "ZM")

dc_traffic_2020 <- dc_traffic_2020 %>%
  filter(!is.na(AADT)) %>%
  filter(!st_is_empty(geometry)) %>%
  st_zm(drop = TRUE, what = "ZM")

dc_traffic_2021 <- dc_traffic_2021 %>%
  filter(!is.na(AADT)) %>%
  filter(!st_is_empty(geometry)) %>%
  st_zm(drop = TRUE, what = "ZM")

dc_traffic_2022 <- dc_traffic_2022 %>%
  filter(!is.na(AADT)) %>%
  filter(!st_is_empty(geometry)) %>%
  st_zm(drop = TRUE, what = "ZM")

Exploratory Data Analysis (EDA)

I first plot the traffic data on a map to explore trends in traffic patterns at a high level. Colors of the roads indicate the amount of Average Annual Traffic by road, which darker shades of red indicating areas of higher traffic. The map is displayed by year, allowing for comparison of traffic between years.

Figure 1. Average Annual Daily Traffic (AADT) in Washington, DC

In [2]:
######## Plot Data Over Time with Leaflet ######## 
# Preprocess the data for each year (2017-2023)
dc_traffic_list <- list()

# Iterate through each year to preprocess
for (year in 2018:2022) {
  data_name <- paste0("dc_traffic_", year)
  dc_traffic <- get(data_name)
  
  # Remove Z-dimension and transform to CRS 4326
  dc_traffic <- dc_traffic %>%
    mutate(geometry = st_zm(geometry)) %>%
    st_transform(crs = 4326)
  
  dc_traffic_list[[year]] <- dc_traffic
}

##### Leaflet Plot ##### 

# Define a moderately bright custom color palette
bright_colors <- c("#4CAF50", "#FFEB3B", "#FF9800", "#F44336", "#D32F2F")  # Bright green, yellow, orange, red, and maroon

# Assuming dc_traffic_list is a list of the traffic data for each year, e.g., dc_traffic_list[[2017]], dc_traffic_list[[2018]], etc.

# Initialize the leaflet map with base tiles
leaflet_map <- leaflet() %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  setView(lng = -77.0369, lat = 38.895, zoom = 12)  # Center the map on Washington, DC

# Loop through each year's data
for (year in 2018:2022) {
  dc_traffic <- dc_traffic_list[[year]]
  
  # Create a color scale based on the AADT values (no transformation applied)
  color_scale_custom <- colorNumeric(palette = bright_colors,  
                                     domain = range(dc_traffic$AADT, na.rm = TRUE))
  
  # Add the data to the leaflet map
  leaflet_map <- leaflet_map %>%
    addPolylines(data = dc_traffic, 
                 color = ~color_scale_custom(AADT),  # Use the AADT values directly
                 weight = 2, 
                 opacity = 0.6, 
                 group = as.character(year)) %>%
    addLayersControl(
      overlayGroups = as.character(2018:2022),
      options = layersControlOptions(collapsed = FALSE),
      position = "bottomright"
    )
}

# Add the legend to the map
leaflet_map <- leaflet_map %>%
  addLegend(
    position = "bottomleft",   # Position of the legend
    pal = color_scale_custom,   # Color scale to use
    values = range(dc_traffic$AADT, na.rm = TRUE),  # AADT range for the legend
    title = "AADT",
    opacity = 1
  )

# Display the map
leaflet_map

I then examine the amount of traffic at a more granular level for each year, allowing for comparisons of the percentage of high-traffic routes for each year. Figure 2 displays a histogram of AADT per year breaking down traffic into larger categories for comparison.

Figure 2.

In [3]:
# filter data
dc_traffic_2018_filtered <- dc_traffic_2018 %>%
  select(ROUTEID, FROMDATE, AADT, geometry)
dc_traffic_2019_filtered <- dc_traffic_2019 %>%
    select(ROUTEID, FROMDATE, AADT, geometry)
dc_traffic_2020_filtered <- dc_traffic_2020 %>%
  select(ROUTEID, FROMDATE, AADT, geometry)
dc_traffic_2021_filtered <- dc_traffic_2021 %>%
  select(ROUTEID, FROMDATE, AADT, geometry)
dc_traffic_2022_filtered <- dc_traffic_2022 %>%
  select(ROUTEID, FROMDATE, AADT, geometry)
  
dc_traffic_all_years <- bind_rows(
  dc_traffic_2018_filtered %>% mutate(year = 2018),
  dc_traffic_2019_filtered %>% mutate(year = 2019),
  dc_traffic_2020_filtered %>% mutate(year = 2020),
  dc_traffic_2021_filtered %>% mutate(year = 2021),
  dc_traffic_2022_filtered %>% mutate(year = 2022)
)


# Filter the dataset to include only roads with AADT >= 50,000
dc_traffic_all_years_filtered <- dc_traffic_all_years %>%
  filter(AADT >= 50000)

# Create bins for AADT with a bin width of 50,000
dc_traffic_all_years_filtered <- dc_traffic_all_years_filtered %>%
  mutate(AADT_bin = cut(
    AADT,
    breaks = seq(50000, max(AADT, na.rm = TRUE), by = 50000),  # Bins with a width of 50,000
    include.lowest = TRUE,
    right = FALSE,
    labels = paste0(
      round(seq(50000, max(AADT, na.rm = TRUE) - 50000, by = 50000) / 1000, 1), "k-", 
      round(seq(100000, max(AADT, na.rm = TRUE), by = 50000) / 1000, 1), "k"
    )
  ))

# Plot the histogram of AADT by year with the 50,000 bins
AADT_Histogram <- ggplot(dc_traffic_all_years_filtered, aes(x = factor(year), fill = AADT_bin)) +
  geom_bar(position = "dodge", width = 0.8) +  # Standardize bar width
  labs(title = "Annual Average Daily Traffic (AADT) by Year",
       x = "Year", y = "Frequency") +
  theme_minimal() +
  scale_fill_brewer(
    palette = "YlGnBu",
    name = "AADT Bins"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

AADT_Histogram

Moran’s I

Next I calculate the Moran’s I for the Average Annual Daily Traffic for each road with data for each year from 2018 - 2022. Moran’s I statistics are used to quantify spatial correlation and assess clustering of data. I use Moran’s I to assess if traffic in DC is spatially auto correlated, and the strength of the clustering of traffic on roads. To reduce computational complexity, I filter to only assess AADT > 50,000 - this only includes roads that have average traffic of more than 50,000 drivers a day. This focuses the analysis on roads with large quantities of drivers and traffic, which is the focus of this research.

In [4]:
library(knitr)
Warning: package 'knitr' was built under R version 4.3.3
library(kableExtra)

Attaching package: 'kableExtra'
The following object is masked from 'package:dplyr':

    group_rows
library(flextable)
Warning: package 'flextable' was built under R version 4.3.3

Attaching package: 'flextable'
The following objects are masked from 'package:kableExtra':

    as_image, footnote
The following objects are masked from 'package:spatstat.geom':

    border, rotate
The following object is masked from 'package:purrr':

    compose
# filter for AADT > 100000
dc_traffic_2018_50k <- dc_traffic_2018_filtered %>% filter(AADT > 50000)
dc_traffic_2019_50k <- dc_traffic_2019_filtered %>% filter(AADT > 50000)
dc_traffic_2020_50k <- dc_traffic_2020_filtered %>% filter(AADT > 50000)
dc_traffic_2021_50k <- dc_traffic_2021_filtered %>% filter(AADT > 50000)
dc_traffic_2022_50k <- dc_traffic_2022_filtered %>% filter(AADT > 50000)

dc_traffic_all_years_50k <- bind_rows(
  dc_traffic_2018_50k,
  dc_traffic_2019_50k,
  dc_traffic_2020_50k,
  dc_traffic_2021_50k,
  dc_traffic_2022_50k
)


# Convert road geometries to Simple Feature objects if not already
dc_traffic_2018_sf <- st_as_sf(dc_traffic_2018_50k)

dc_traffic_2019_sf <- st_as_sf(dc_traffic_2019_50k)
dc_traffic_2019_sf <- dc_traffic_2019_sf %>%
  filter(!st_is_empty(geometry)) 

dc_traffic_2020_sf <- st_as_sf(dc_traffic_2020_50k)
dc_traffic_2020_sf <- dc_traffic_2020_sf %>%
  filter(!st_is_empty(geometry)) 

dc_traffic_2021_sf <- st_as_sf(dc_traffic_2021_50k)
dc_traffic_2022_sf <- st_as_sf(dc_traffic_2022_50k)

# Compute pairwise distances between geometries (in meters or desired units)
dist_2018 <- st_distance(dc_traffic_2018_sf)
dist_2019 <- st_distance(dc_traffic_2019_sf)
dist_2020 <- st_distance(dc_traffic_2020_sf)
dist_2021 <- st_distance(dc_traffic_2021_sf)
dist_2022 <- st_distance(dc_traffic_2022_sf)

threshold <- units::set_units(100, "m")


# Define a function to create spatial weights matrix
create_weights_matrix <- function(distance_matrix, threshold) {
  # Convert distance matrix to binary weights based on the threshold
  weights_matrix <- ifelse(distance_matrix <= threshold, 1, 0)
  weights_matrix[lower.tri(weights_matrix, diag = TRUE)] <- 0  # Remove self-links
  return(weights_matrix)
}

# Create spatial weights matrices for each year
weights_2018 <- create_weights_matrix(dist_2018, threshold)
weights_2019 <- create_weights_matrix(dist_2019, threshold)
weights_2020 <- create_weights_matrix(dist_2020, threshold)
weights_2021 <- create_weights_matrix(dist_2021, threshold)
weights_2022 <- create_weights_matrix(dist_2022, threshold)


# Function to compute Moran's I
compute_morans_i <- function(aadt_values, weights_matrix) {
  # Create a spatial weight list object from the matrix with style "W" (binary)
  nb <- mat2listw(weights_matrix, style = "W", zero.policy = TRUE)  # Allow zero neighbors
  
  # Compute Moran's I
  moran_i <- moran.test(aadt_values, nb)
  
  return(moran_i$estimate[1])  # Moran's I value
}

# Calculate Moran's I for each year
moran_i_2018 <- compute_morans_i(dc_traffic_2018_sf$AADT, weights_2018)
Warning in mat2listw(weights_matrix, style = "W", zero.policy = TRUE):
neighbour object has 9 sub-graphs
moran_i_2019 <- compute_morans_i(dc_traffic_2019_sf$AADT, weights_2019)
Warning in mat2listw(weights_matrix, style = "W", zero.policy = TRUE):
neighbour object has 9 sub-graphs
moran_i_2020 <- compute_morans_i(dc_traffic_2020_sf$AADT, weights_2020)
Warning in mat2listw(weights_matrix, style = "W", zero.policy = TRUE):
neighbour object has 5 sub-graphs
moran_i_2021 <- compute_morans_i(dc_traffic_2021_sf$AADT, weights_2021)
Warning in mat2listw(weights_matrix, style = "W", zero.policy = TRUE):
neighbour object has 5 sub-graphs
moran_i_2022 <- compute_morans_i(dc_traffic_2022_sf$AADT, weights_2022)
Warning in mat2listw(weights_matrix, style = "W", zero.policy = TRUE):
neighbour object has 7 sub-graphs
# Create a data frame to compare Moran's I across years
moran_comparison <- data.frame(
  Year = 2018:2022,
  Moran_I = c(moran_i_2018, moran_i_2019, moran_i_2020, moran_i_2021, moran_i_2022)
)



moran_comparison <- data.frame(
  Year = as.character(2018:2022),
  Moran_I = c(moran_i_2018, moran_i_2019, moran_i_2020, moran_i_2021, moran_i_2022)
)

ft <- flextable(moran_comparison) %>%
  set_caption(caption = "Comparison of Moran's I Across Years") %>%
  colformat_double(j = "Moran_I", digits = 3) %>%
  autofit() %>%
  theme_box()

ft

Year

Moran_I

2018

0.769

2019

0.840

2020

0.824

2021

0.740

2022

0.783

Figure 3 displays Moran’s I comparison across years. There is peak Moran’s I value of 0.84 in 2019, with a clear dip to 0.74 in 2021.

Figure 3.

In [5]:
# Create a data frame to compare Moran's I across years
moran_comparison <- data.frame(
  Year = 2018:2022,
  Moran_I = c(moran_i_2018, moran_i_2019, moran_i_2020, moran_i_2021, moran_i_2022)
)

morans_time <- ggplot(moran_comparison, aes(x = Year, y = Moran_I)) +
  geom_line() +
  geom_point() +
  labs(title = "Moran's I Comparison Across Years", x = "Year", y = "Moran's I") +
  theme_minimal()
morans_time

Spatial Autoregressive (SAR) Model

I then create a spatial autoregressive (SAR) model to test the statistical significance of spatial autocorrelation of the traffic in Washingon, DC. I run a spatial regression on the Average Annual Daily Traffic for pre-pandemic years (2018-2019) and post-pandemic years (2021-2022). I exclude 2020 because this was when the COVID-19 pandemic spread across the world and lockdowns were implemented, so analyses could be skewed as people were forced to stay home.

Table 1 displays results from the pre-pandemic statistical model, while table 2 displays results from the post-pandemic model.

Table 1. Pre-Pandemic SAR Model (2018-2019)

In [6]:
set.seed(520)

# Pre-pandemic (2018-2019) data
pre_pandemic_data <- dc_traffic_all_years %>% filter(year %in% 2018:2019)
pre_pandemic_data <- st_as_sf(pre_pandemic_data)
pre_pandemic_data <- st_transform(pre_pandemic_data, crs = 6487)

# Sample data for pre-pandemic (if needed)
sampled_pre <- pre_pandemic_data %>%
  group_by(year) %>%
  slice_sample(n = 1000, replace = FALSE) %>%
  ungroup()

# Sample data for pre-pandemic (if needed)
sampled_pre <- pre_pandemic_data %>%
  group_by(year) %>%
  slice_sample(n = 1000, replace = FALSE) %>%
  ungroup()

# Calculate centroids for pre-pandemic data
coords_pre <- st_coordinates(st_centroid(sampled_pre))
Warning: st_centroid assumes attributes are constant over geometries
# Pre-pandemic k-nearest neighbors (adjust k as needed)
nb_knn_pre <- knn2nb(knearneigh(coords_pre, k = 5))  # Adjust k if needed
Warning in knearneigh(coords_pre, k = 5): knearneigh: identical points found
Warning in knearneigh(coords_pre, k = 5): knearneigh: kd_tree not available for
identical points
Warning in knn2nb(knearneigh(coords_pre, k = 5)): neighbour object has 3
sub-graphs
# Convert to spatial weights matrices (row standardization)
listw_pre <- nb2listw(nb_knn_pre, style = "W")

# Fit SAR model for pre-pandemic period
sar_model_pre <- lagsarlm(AADT ~ year, data = sampled_pre, listw = listw_pre)
Warning in lagsarlm(AADT ~ year, data = sampled_pre, listw = listw_pre): inversion of asymptotic covariance matrix failed for tol.solve = 2.22044604925031e-16 
  reciprocal condition number = 3.56977e-18 - using numerical Hessian.
# Summary of pre-pandemic model
model_result_pre <- summary(sar_model_pre)

# Extract the Rho value and its details
rho_value <- sar_model_pre$rho
rho_se <- sar_model_pre$rho.se

# Create a custom stargazer output
stargazer(sar_model_pre, type = "text", 
          title = "Spatial Autoregressive Model: Pre-Pandemic Traffic (2018 - 2019)",
          single.row = TRUE,
          ci = TRUE,
          add.lines = list(
            c("Rho", round(rho_value, 4)),
            c("Rho Std. Error", round(rho_se, 4))
          ))

Spatial Autoregressive Model: Pre-Pandemic Traffic (2018 - 2019)
============================================================
                             Dependent variable:            
                  ------------------------------------------
                                     AADT                   
------------------------------------------------------------
year                     166.322 (-196.947, 529.590)        
Constant          -329,056.600 (-1,062,312.000, 404,198.600)
------------------------------------------------------------
Rho                                 0.5426                  
Rho Std. Error                      0.0244                  
Observations                        2,000                   
Log Likelihood                   -22,291.750                
sigma2                         264,541,092.000              
Akaike Inf. Crit.                 44,591.500                
Wald Test                    494.007*** (df = 1)            
LR Test                      360.742*** (df = 1)            
============================================================
Note:                            *p<0.1; **p<0.05; ***p<0.01

Table 2. Post-Pandemic SAR Model (2021-2022)

In [7]:
set.seed(300)

# Post-pandemic (2020-2022) data
post_pandemic_data <- dc_traffic_all_years %>% filter(year %in% 2021:2022)

post_pandemic_data <- st_as_sf(post_pandemic_data)

post_pandemic_data <- st_transform(post_pandemic_data, crs = 6487)

# Sample data for post-pandemic (if needed)
sampled_post <- post_pandemic_data %>%
  group_by(year) %>%
  slice_sample(n = 1000, replace = FALSE) %>%
  ungroup()

coords_post <- st_coordinates(st_centroid(sampled_post))
Warning: st_centroid assumes attributes are constant over geometries
# Post-pandemic k-nearest neighbors
nb_knn_post <- knn2nb(knearneigh(coords_post, k = 5))  # Adjust k if needed
Warning in knearneigh(coords_post, k = 5): knearneigh: identical points found
Warning in knearneigh(coords_post, k = 5): knearneigh: kd_tree not available
for identical points
Warning in knn2nb(knearneigh(coords_post, k = 5)): neighbour object has 4
sub-graphs
listw_post <- nb2listw(nb_knn_post, style = "W")

# Fit SAR model for post-pandemic period
sar_model_post <- lagsarlm(AADT ~ year, data = sampled_post, listw = listw_post)
Warning in lagsarlm(AADT ~ year, data = sampled_post, listw = listw_post): inversion of asymptotic covariance matrix failed for tol.solve = 2.22044604925031e-16 
  reciprocal condition number = 7.5447e-18 - using numerical Hessian.
# Summary of post-pandemic model
# Summary of post-pandemic model
model_result_post <- summary(sar_model_post)

# Create a detailed model summary table

# Extract the Rho value and its details
rho_value <- sar_model_post$rho
rho_se <- sar_model_post$rho.se

# Create a custom stargazer output
stargazer(sar_model_post, type = "text", 
          title = "Spatial Autoregressive Model: Post-Pandemic Traffic (2021 - 2022)",
          single.row = TRUE,
          ci = TRUE,
          add.lines = list(
            c("Rho", round(rho_value, 4)),
            c("Rho Std. Error", round(rho_se, 4))
          ))

Spatial Autoregressive Model: Post-Pandemic Traffic (2021 - 2022)
====================================================================
                                 Dependent variable:                
                  --------------------------------------------------
                                         AADT                       
--------------------------------------------------------------------
year                       831.741*** (655.876, 1,007.605)          
Constant          -1,674,456.000*** (-2,029,966.000, -1,318,946.000)
--------------------------------------------------------------------
Rho                                     0.4279                      
Rho Std. Error                          0.0281                      
Observations                            2,000                       
Log Likelihood                       -22,006.760                    
sigma2                             204,162,100.000                  
Akaike Inf. Crit.                     44,021.510                    
Wald Test                        232.721*** (df = 1)                
LR Test                          192.492*** (df = 1)                
====================================================================
Note:                                    *p<0.1; **p<0.05; ***p<0.01

Discussion

The peak Moran’s I value in 2019 followed by a dip in 2021 suggests that spatial autocorrelation of traffic in Washington, DC decreases during and after the COVID-19 pandemic. The pre-pandemic values signify a high spatial autocorrelation, meaning that roads with similar traffic volumes (high or low AADT) were strongly clustered. This could indicate consistent spatial patterns, where certain areas had consistently high or low traffic volumes possibly driven by predictable commuting behavior, economic activity, and infrastructure.

The downward slope of Moran I values beginning in 2020 and continuing in 2021 display a decrease in spatial autocorrelation, suggesting that the clustering of roads with similar traffic volumes weakened. This is possibly due to changes in commuting behaviors as a result of COVID-19 remote or hybrid school and work policies, as people stay home or commute into areas of DC less frequently. Traffic patterns became less predictable and more spatially dispersed; roads in previously high-traffic areas may have experienced significant drops in traffic, while others (e.g., residential or suburban areas) saw increases.

The Rho value is also smaller in the post-pandemic period (0.4279) compared to the pre-pandemic period (0.5426). These values signify that post-pandemic traffic is about 25% less related to geospatial autocorrelation than pre-pandemic traffic. We can infer that in the years after the pandemic, traffic became more dispersed and less clustered spatially, suggesting a potential shift in traffic patterns—areas with high traffic were not as strongly influenced by neighboring areas with high traffic. Geospatial independence is more prominent in the post-pandemic traffic data, signifying that traffic is more spread out across the city.

Though this is not a direct causal statistical model, the methods used in this spatial autoregressive model are similar to methods of Granger Causality, which assess predictive causality in time-series data (Clarke & Granato, 2005) . Examining traffic data at a yearly level largely removes bias from data in the form of idiosyncratic events, such as a disruption in DC Metro service influencing the amount of traffic on a particular day. Taken with the timing of the COVID-19 pandemic as an event that separates the SAR models, these methods are close to those that allow causal or predictive inferences under Granger Causality.

Conclusion

Using Moran’s I for initial analysis and a Spatial Autoregressive Model for validation, the there is compelling and statistically significant evidence to suggest that traffic in Washington, DC is not as geospatially autocorrelated as before the COVID-19 pandemic. These findings suggest suggest that traffic patterns in Washington, DC have become more spread out since the COVID-19 pandemic in comparison to the years before 2020. Thee result of this is changed traffic volume in roads in the city that did not previously have traffic.

The implications of these new traffic patterns and volume are relevant for city policymakers as they address issues that dense traffic patterns present. This research serves as the first step in addressing a changed city and traffic behavior of its residents.

References

Inrix. (2024). Global Traffic Scorecard. https://inrix.com/scorecard-2022/

Marks & Harrison. (2023, September 14). Why is there so much traffic congestion in the Washington, D.C., Metro Area. https://www.marksandharrison.com/blog/traffic-congestion-in-washington-dc/#:~:text=The%20metro%20area’s%20population%20has,themselves%20safe%20while%20in%20transit.

Barrot et. al. (2024, February 9). Causal effects of closing businesses in a pandemic. Journal of Financial Economics. https://www.sciencedirect.com/science/article/pii/S0304405X24000175

Frost, R. (2023, March). Did More People Move During the Pandemic?. Harvard Joint Center for Housing Studies. https://www.jchs.harvard.edu/sites/default/files/research/files/harvard_jchs_pandemic_mobility_frost_2023.pdf

Zheng, Y., Wang, S., Liu, L. et al. “Impacts of remote work on vehicle miles traveled and transit ridership in the USA.” Nat Cities 1, 346–358 (2024). https://doi.org/10.1038/s44284-024-00057-1

Shaik, Ebrahim, and Samsuddin Ahmed. “An Overview of the Impact of COVID-19 on Road Traffic Safety and Travel Behavior.” Transportation Engineering, National Library of Medicine, Sept. 2022, pmc.ncbi.nlm.nih.gov/articles/PMC9123801/.

District of Columbia Government. (2024). Traffic Volume. Open Data DC. https://opendata.dc.gov/datasets/DCGIS::2022-traffic-volume/about?uiVersion=content-views

Clarke, H. D., & Granato, J. (2005). Granger-causality. Granger-Causality - An Overview. https://www.sciencedirect.com/topics/social-sciences/granger-causality